Pipelining and Overlapping for MPI Collective Operations
Abstract
Collective operations are an important part of MPI (Message Passing Interface), currently the most important message-passing programming model. Many MPI applications make heavy use of them. Collective operations involve the active participation of a known group of processes and are usually implemented on top of MPI point-to-point message passing. Many optimizations of the underlying communication algorithms have been developed, but the vast majority of them are still based on plain MPI point-to-point message passing. While this has the advantage of portability, it often does not allow full exploitation of the underlying interconnection network. In this paper, we present a low-level, pipeline-based optimization of one-to-many and many-to-one collective operations for the SCI (Scalable Coherent Interface) interconnection network. The optimizations increase the performance of some operations by a factor of four compared with the generic, tree-based algorithms.
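To illustrate why pipelining can beat a tree for large messages, the following is a minimal sketch of a textbook latency-bandwidth cost model. It is not the paper's SCI implementation: the function names, the chain topology for the pipeline, and the values of `alpha` (per-message latency) and `beta` (per-byte transfer time) are all assumptions chosen for illustration.

```python
import math


def tree_bcast_time(p, m, alpha, beta):
    """Estimated time of a binomial-tree broadcast (illustrative model).

    The tree needs ceil(log2 p) rounds, and every round forwards the
    whole m-byte message, costing alpha + m * beta.
    """
    if p < 2:
        return 0.0
    return math.ceil(math.log2(p)) * (alpha + m * beta)


def pipelined_bcast_time(p, m, alpha, beta, segments):
    """Estimated time of a chain broadcast that pipelines equal segments.

    The first segment crosses p - 1 hops; each of the remaining
    segments arrives one step later, so there are
    (p - 1) + (segments - 1) steps of cost alpha + (m / segments) * beta.
    """
    if p < 2:
        return 0.0
    steps = (p - 1) + (segments - 1)
    return steps * (alpha + (m / segments) * beta)


if __name__ == "__main__":
    # Hypothetical parameters: 8 processes, 1 MiB message,
    # 5 microseconds latency, 4 nanoseconds per byte.
    p, m = 8, 1 << 20
    alpha, beta = 5e-6, 4e-9
    print("tree:", tree_bcast_time(p, m, alpha, beta))
    for s in (1, 16, 256):
        print("pipeline, segments =", s,
              pipelined_bcast_time(p, m, alpha, beta, s))
```

In this model, an unsegmented chain is slower than the tree, but once the message is split into enough segments the per-step cost shrinks faster than the step count grows, so the pipeline wins for large messages. The actual gain on a given network depends on hardware properties that this sketch does not capture.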
Similar resources
Sparse Non-blocking Collectives in Quantum Mechanical Calculations
For generality, MPI collective operations support arbitrary dense communication patterns. However, in many applications where collective operations would be beneficial, only sparse communication patterns are required. This paper presents one such application: Octopus, a production-quality quantum mechanical simulation. We introduce new sparse collective operations defined on graph communicators...
MPI collectives at scale
Collective operations improve the performance and reduce the code complexity of many applications parallelized with the message-passing interface (MPI) paradigm. In this article, we will investigate the impact of load imbalance on the performance of collective operations and the possibility of hiding parallel overhead caused by a collective communication pattern, by overlapping the communication with c...
Design of Scalable PGAS Collectives for NUMA and Manycore Systems
The increasing number of cores per processor is making multicore-based systems pervasive. This involves dealing with multiple levels of memory in NUMA systems, accessible via complex interconnects, in order to dispatch the increasing amount of data required. The key to efficient and scalable provision of data is the use of collective communication operations that minimize the impact of bott...
DRAFT 10/14/2008 Non-Blocking Collective Operations for MPI-3 The MPI-3 Collective Operations
We propose new non-blocking interfaces for the collective group communication functions defined in MPI-1 and MPI-2. This document is meant as a standard extension and is written in the same way as the MPI standards. It covers the MPI API as well as the semantics of the new operations.
A Case for Standard Non-blocking Collective Operations
In this paper we make the case for adding standard nonblocking collective operations to the MPI standard. The non-blocking point-to-point and blocking collective operations currently defined by MPI provide important performance and abstraction benefits. To allow these benefits to be simultaneously realized, we present an application programming interface for non-blocking collective operations i...